 Chautauqua County


"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

Ovalle, Anaelia, Goyal, Palash, Dhamala, Jwala, Jaggers, Zachary, Chang, Kai-Wei, Galstyan, Aram, Zemel, Richard, Gupta, Rahul

arXiv.org Artificial Intelligence

Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation (OLG). This social knowledge serves as a guide for evaluating popular large language models (LLMs) on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We discover a dominance of binary gender norms reflected by the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic, on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.


Learning from Dialogue after Deployment: Feed Yourself, Chatbot!

Hancock, Braden, Bordes, Antoine, Mazare, Pierre-Emmanuel, Weston, Jason

arXiv.org Machine Learning

The majority of conversations a dialogue agent sees over its lifetime occur after it has already been trained and deployed, leaving a vast store of potential training signal untapped. In this work, we propose the self-feeding chatbot, a dialogue agent with the ability to extract new training examples from the conversations it participates in. As our agent engages in conversation, it also estimates user satisfaction in its responses. When the conversation appears to be going well, the user's responses become new training examples to imitate. When the agent believes it has made a mistake, it asks for feedback; learning to predict the feedback that will be given improves the chatbot's dialogue abilities further. On the PersonaChat chit-chat dataset with over 131k training examples, we find that learning from dialogue with a self-feeding chatbot significantly improves performance, regardless of the amount of traditional supervision.
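The routing step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `SelfFeedingBuffers`, `route_turn`, and `record_feedback`, the two thresholds, and the assumption that the satisfaction estimate is a score in [0, 1] are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SelfFeedingBuffers:
    """Hypothetical containers for examples harvested after deployment."""
    # (context, user reply) pairs to imitate when the conversation goes well
    dialogue_examples: List[Tuple[str, str]] = field(default_factory=list)
    # (context, feedback) pairs collected after a suspected mistake
    feedback_examples: List[Tuple[str, str]] = field(default_factory=list)


def route_turn(context: str, user_reply: str, satisfaction: float,
               buffers: SelfFeedingBuffers,
               high: float = 0.7, low: float = 0.3) -> str:
    """Decide what to do with one turn, given an estimated satisfaction score.

    The thresholds `high` and `low` are assumed values, not taken from the paper.
    """
    if satisfaction >= high:
        # Conversation appears to be going well: keep the user's reply
        # as a new training example to imitate.
        buffers.dialogue_examples.append((context, user_reply))
        return "continue"
    if satisfaction <= low:
        # The agent believes it made a mistake: ask the user for feedback.
        return "ask_feedback"
    # Uncertain region: harvest nothing and carry on.
    return "continue"


def record_feedback(context: str, feedback: str,
                    buffers: SelfFeedingBuffers) -> None:
    """Store the user's feedback as a target for a feedback-prediction task."""
    buffers.feedback_examples.append((context, feedback))
```

In this sketch the two buffers would feed two separate training objectives, imitating user replies and predicting feedback, which matches the two signals the abstract describes.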


Is that funny? Microsoft develops AI that can assess comedy

#artificialintelligence

Microsoft has added another type of artificial intelligence to its expanding array of machine intelligence systems (among other ventures as profiled in the article "How Microsoft's CEO Nadella has steered the company to success"). This is a form of face-scanning technology that can assess patterns of laughter and potentially assess the degree to which an event or person is found 'funny' by an audience. The technology was showcased at the Laugh Battle exhibit which took place at the National Comedy Center in Jamestown, New York. Here a trial took place where the Microsoft technology was used to assess several performers and to determine which comedian was 'best' at delivering their jokes based on an assessment of the reactions of the audience, according to Inverse. The reactions were processed by the platform's deep neural network, which was developed by Azure Cognitive Services.